ABSTRACT


The provision of comprehensive services by a complex 
modern computer installation is expensive. In the face of 
increasing demand for computer service, system expansion 
may be proposed. This expansion may not be necessary if 
existing resource utilization can be increased or more 
equally distributed. 

This research investigates the possibility of increased 
system throughput through a balancing of the demand on the 
individual modules of an IBM 2314 Disk Facility. The performance
of the disk modules is measured utilizing a
hardware monitor. The hardware monitor is also used to 
obtain system performance profiles. 

Comparison of system throughput is made during times
when different sets of resources are available. Recommendations
are made to improve system performance by rearranging the
data sets on the disk modules.

I. INTRODUCTION


The high cost of providing comprehensive services in a 
modern computer installation motivates the manager to reduce 
the costs or at least maintain costs at a constant level 
during a time of continually increasing demands for service. 
If the computer system is not completing its assigned tasks 
according to the required schedule, expansion of the system 
capacity may be proposed. For example, additional core 
storage may be added, a faster, greater capacity auxiliary 
storage device may be substituted for an existing device, 
faster input/output peripheral devices may be obtained, or 
even the CPU itself may be upgraded. All of these alternatives 
involve significant financial expenditure, but another 
alternative exists which may be less expensive. That is to
continue to utilize the same equipment, but to increase the
effective utilization of this equipment to meet the increased
demand for computational power.

In order for this last alternative to be selected, its 
feasibility must be determined and before making this 
determination one must measure the present performance of 
the system. 

The measurement of the performance of a complex computer 
system is difficult, but the potential rewards are significant 
and well documented (References 1 and 2). In addition to the
improvements in performance made possible by performance
measurement, the measurements provide a basis for future
decisions on configuration changes and system expansion.

This research is directed at the performance measurement 
of an IBM 2314 Disk Facility and its associated selector 
channel. The performance of individual disk modules is 
measured and the resulting data is analyzed. Recommendations
are presented to improve system performance.






II. BACKGROUND AND OBJECTIVES


Knowing what to measure with a hardware monitor is
difficult. If measurements are too gross, specific recommendations
for change are difficult or impossible to formulate,
while if measurements are very detailed they may form the
basis for a recommendation to improve utilization of one
component of a system without considering the concurrent
effects on the overall system performance.

This research is part of a continuing effort to determine 
precisely how to identify and measure the work accomplished 
by and performance of a complex modern computing system.

The very definitions of the terms "work", "performance", 
and "computer power" are under discussion and subject to 
efforts for more precise definition (Reference 3).

Hanke has made measurements on the IBM 360 Model 67
installed at the Naval Postgraduate School (Reference 4).
He reported a large percentage of CPU wait only time (CPU
in wait state and selector channel 2 not busy). One possible
cause of this is large disk arm seek time, i.e., the CPU
and selector channel are both waiting for the disk arm to
move to some other disk track. The primary objective of this
research was to measure the performance of the IBM 2314
Disk Facility and its associated selector channel to determine
the percentage of time spent by the CPU and selector channel
both waiting for the disk arm to move to another track, then
to determine if this arm seek time accounted for the majority
of the CPU wait only time. In addition to specific measurements
of the 2314 Disk Facility, it was desirable to record
broad system performance profiles to determine if CPU wait
only continued to be significantly high.

Some improvements were recommended in the data reduction
and analysis programs written by Hanke (Reference 4). These
improvements included the addition of the ability to plot
the output from the hardware monitor and the inclusion of a
date check in the program SMF Graph. Some improvements in
the statistical analysis of data to determine means, variances
and correlations were also required.

Thus, the overall objective of the research was to 
combine a specific performance measurement experiment with 
supporting data reduction and analysis in order to make a 
specific recommendation for improvement in system performance.
The objective of this thesis is to report the results
of this research and to suggest areas for further research.





III. EXPERIMENTAL PROCEDURE 


A. MEASUREMENT ENVIRONMENT

The computer system under investigation is an IBM 360 
Model 67 with configuration as shown in Figure 1. The
system was operated in a simplex mode (single CPU) with 
768K bytes of core storage for 20 hours per weekday and as 
a split system (separate operating systems on the two CPU's) 
for four hours per day. On Saturdays and Sundays the system 
is run from 0800-2000 in a simplex mode. While a split 
system is operating from 1200 to 1600 each weekday, part of 
the computer resources are assigned to a time-sharing 
system, CP/CMS (Cambridge Monitor System). The major change 
in resources available for batch processing operation includes 
the loss of 256K bytes of core storage and the 2301 drum. 
Detailed allocation of resources during the four hours of 
time-sharing is shown in Figure 2. 

The operating system under investigation is OS/360 MVT
(Multiprogramming with a Variable number of Tasks). (The 
operation of the CP/CMS time-sharing system was not measured 
as part of this research.) During the twenty hours per day 
without time-sharing, 768K bytes of core storage are available 
with 478K bytes available for the execution of problem 
programs and the remaining 290K bytes for use by the operating 
system. The use of 256K bytes by the time-sharing system
leaves 222K bytes for problem programs during the 1200-1600
time period. (Since this research was completed some parts
of the operating system, namely the resident SVC's, have been
made non-resident and this increased the usable core to 260K
bytes.)

Figure 1. Naval Postgraduate School IBM 360 Model 67

DEVICE OS/MVT CP/CMS

CPU 2067-2 X
CPU 2067-2 X
PRINTER KEYBOARD 1052-7 X
PRINTER KEYBOARD 1052-7 X
CORE STORAGE 2365-12 X
CORE STORAGE 2365-12 X
CORE STORAGE 2365-12 X
DRUM STORAGE 2301 X
DISK STORAGE 2311 (8) X
DISK STORAGE 2314 X
TAPE UNITS 2402-1 X X
CARD READER 2501-B2 X
CARD READ PUNCH 2540 X
PLOTTER 765 (2) X
PRINTER 1403-N1 (2) X
CHANNEL CONTROLLER 2846-1 (2) X X

Figure 2. Computer Resource Allocation Under CP/CMS

Operating policy also varies with the time of day. The 
primary objective of the operating policy is to give quick 
turnaround for small, short jobs (≤ 100K bytes, ≤ 20 seconds
CPU time). No particular attempt is made to balance the
workload of the system, i.e., control the job mix to 
execute both I/O bound and compute bound jobs at the same 
time. Control of the job mix would be difficult as job entry 
is by way of a user operated hot card reader. Job classes 
are defined to give the highest priority to the small, short 
jobs. The use of QUICKRUN (Reference 5) as a sub-system of 
the operating system is also highly favorable to the small 
jobs, generally providing "instant" turnaround (less than five 
minutes) for the small jobs. QUICKRUN is a job management 
system which processes problem programs faster than OS/MVT 
by reducing the operating system overhead associated with 
each job. Restrictions on jobs eligible to be run under 
QUICKRUN include less than 100K bytes, less than 20 seconds 
of CPU time, no use of tape, and less than 1000 lines of 
printed output. 

Job arrivals are heavily concentrated in the afternoon 
with the peak load usually coming between 1400-1600. During 
the month of March 1972 when these measurements were taken,
24,500 jobs were processed; of these 11,700 were under QUICKRUN.







B. EXPERIMENTAL DESCRIPTION

The primary measuring device used in this research was 
the Measurement Engine, a hardware monitor manufactured by 
Boole and Babbage, Inc. The use of the Measurement Engine
in system performance measurement and analysis is described
in References 4 and 6. The Measurement Engine is actually a
hardware monitor system with many different possible configurations.
As used for these experiments, the configuration
consisted of two ME-1011 Event Monitors and one ME-2011 Paper 
Tape Printer, all owned by this institution. Each Event 
Monitor can receive signals from eight probes attached to the 
host computer. The probe signals may then be combined on a 
user wired logic plugboard which has AND, NOR, INVERTER, and 
FLIPFLOP capabilities. The outputs from the logic plugboard 
are then routed to the six counters and the paper tape printer. 
Logic signals may be routed between Event Monitors which may 
be stacked one upon the other. 

To obtain the nine signals shown in Figure 3, nine probes
were connected to the appropriate computer pins also shown
in Figure 3.


SIGNAL                   DEVICE   PIN

CPU manual               2067     EC2H4B09
CPU wait                 2067     EC2J6B07
Channel 2 busy           2360     BA3D6D04
MVTREX disk arm seek     2314     AA3H4D11*
MVTLNX disk arm seek     2314     AA3H4D11
LINDA disk arm seek      2314     AA3H4D11
SPOOL 1 disk arm seek    2314     AA3H4D11
SPOOL 2 disk arm seek    2314     AA3H4D11
SPOOL 3 disk arm seek    2314     AA3H4D11

*This pin is probed on each module measured.

Figure 3. Hardware Monitor Signal Probe Connections

The input signals were combined using the logic board
capability of the Event Monitor. A diagrammatic representation
of the logic is shown in Figure 4. The resulting signals
representing the ten events shown in Figure 5 were accumulated
by the counters of the two Event Monitors and at preselected
time intervals were recorded by the Paper Tape Printer. These
paper tape data were then keypunched to be used as input
to the program Hardware Graph (Reference 4), which presents
a bar graph for each event for each time interval.

The conditions of each experiment are summarized in
Figure 6, but it is appropriate here to discuss some of the
reasons for conducting the experiments under these conditions.

In order to ensure robustness of results, it was desired
to conduct worst case experiments and analysis. The tenth
and eleventh weeks of a twelve week academic quarter were 
chosen as appropriate times for measurements due to the 
historically heavy workload during these two weeks. 

It was also desired to compare system performance during 
the time periods when the time-sharing system CP/CMS was 
being utilized against the periods when OS/MVT was operating
exclusively. This dictated that the afternoon be included. 
Also, the highest job arrival frequency is during the 
afternoon. 

For the first two days' experiments (1 and 2), a time
interval of 15 minutes was chosen in order to determine the
range of values over relatively short time intervals. There
were no wide fluctuations during the 15 minute intervals so
30 minute intervals were chosen for the remaining experiments.
The 60 minute interval was chosen for the final experiment
due to the failure of the paper tape printer. The hardware
monitor holds the accumulated utilization values in a buffer
for output to the paper tape printer until the next time
interval has elapsed. This allows the experimenter to hand
record the values in the buffer just before the end of a time
interval and just after the end of a time interval. By
recording data from two time intervals, the experimenter
may then be physically absent from the hardware monitor for
slightly less than two more time intervals. For example,
by using the 60 minute interval, one may be absent from the
hardware monitor for about 1 hour and 50 minutes of every
2 hours without losing any data. It is felt that these
different time intervals do not significantly affect the
results reported herein.

System performance was monitored for a total of 66
hours. Of this time there were 768K bytes of core storage
available to the system for 50 hours. For 16 hours 512K
bytes of core storage were available to the system as 256K
bytes of core storage and the 2301 drum were being utilized
by the time-sharing system. The 66 hours of measurement
time were divided into 46 hours during weekdays and 20 hours
during the weekend.

IV. DISCUSSION OF RESULTS


A. DISK MODULE PERFORMANCE 

The six modules of the IBM 2314 Disk Facility whose 
performance was measured are known by the names MVTREX, MVTLNX, 
LINDA, SPOOL 1, SPOOL 2, and SPOOL 3. Two other user disk 
modules named MARY and DUFFY were not measured because their 
activity is much lower than those measured. In this discussion 
the comparisons involve the condition when the CPU is in the 
wait state, the selector channel is not busy, and a disk
arm is seeking (moving to another track). This condition 
will be referred to, for example, as MVTREX seek without 
repeating the CPU wait and channel not busy qualifiers. 

System performance profiles (Figure 7) show that the
CPU wait percentage varied widely, from 0 to 86 percent.
Averaged over the seven experiments
the mean CPU wait was 51 percent. The CPU wait only (CPU
wait and selector channel 2 not busy) averaged over the
seven experiments ranged from 6 to 55 percent with a mean
of 26 percent. Thus, on the average, the CPU is idle half
the time and for about half of this CPU idle time the
channel is also idle. One condition that may cause both
the CPU and channel to be waiting is a disk arm seeking
(moving to another track). Data from the seven experiments
showed that the module MVTLNX had more arm seek time than
the other five disk modules measured (Figure 8). The ratio
of MVTLNX seek to the other disk modules ranged from 1.9:1
to 12.0:1. The ratio of MVTLNX seek to the mean disk seek
was 2.65:1 averaged over the seven experiments. Thus, there
was an unbalanced demand placed on this one disk module,
MVTLNX.

What then were the contents of this disk module which 
may have caused this imbalanced demand? MVTLNX is a system 
module with three particular data sets of interest. The 
most active data set on MVTLNX was the operating system job 
queue. This job queue data set is allocated 30 cylinders 
(4.3 million bytes) of space which is referenced by many 
parts of the operating system. The job queue must be accessed 
by the reader program, the initiator program, the writer 
program, and by the display commands issued by the 
operator's console, for a minimum of between 6 and 16 
accesses per job. 

Another significant data set on the MVTLNX module is the 
link library. This data set is allocated 50 cylinders (7.2 
million bytes) of space. The link library contains the 
executable modules for the reader program, the writer program, 
and the initiator program. It also contains the language 
processing modules (FORTRAN G, FORTRAN H, PL/I, COBOL, and 
RPG), non-resident operating system modules, supervisor calls 
(SVC) and input/output error recovery modules. This data 
set must be accessed a minimum of 3 to 5 times for each job 
execution. 

The third data set of interest in the module MVTLNX is
used for recording accounting data. System Management
Facilities (SMF) information is written into this third
data set. SMF is an optional feature of OS/MVT which
records system and job performance information. (Use of SMF
as a software monitor is explained by Hanke (Reference 4).)
In particular, job start and stop times, CPU times used and
identification data are recorded for each job step upon
completion of the job step. This data set is thus accessed
at least three times on an average, non-QUICKRUN job (once
per job step).

This one disk module MVTLNX therefore contains three
data sets which must be accessed between 12 and 24 times for each
job execution. This would be the case for a typical FORTRAN
compile, link-edit and execute job, which accounts for about
half of all jobs submitted, not including those FORTRAN jobs
run under QUICKRUN.
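
The range of 12 to 24 accesses can be checked by summing the per-data-set counts quoted above. The following Python sketch is illustrative only: the dictionary keys and the function name are hypothetical labels, and the SMF entry assumes exactly three job steps (compile, link-edit, execute), whereas the text gives three only as a minimum.

```python
# Illustrative sketch; names are hypothetical, counts are those quoted in the text.
ACCESSES_PER_JOB = {
    "job queue": (6, 16),     # reader, initiator, writer, display commands
    "link library": (3, 5),   # executable and language processing modules
    "SMF recording": (3, 3),  # once per job step; three steps assumed
}


def total_access_range(accesses):
    """Sum the low and high access counts over all data sets."""
    low = sum(lo for lo, _ in accesses.values())
    high = sum(hi for _, hi in accesses.values())
    return low, high


print(total_access_range(ACCESSES_PER_JOB))
```

Under these assumptions the sums reproduce the 12 and 24 accesses per job stated in the text.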

The questions arise as to which of the data sets could
be transferred from MVTLNX to another location, where the
data set could be relocated, and what effect the transfer
would have on system performance. The first data set
examined was the job queue. The job queue is presently
allocated 30 cylinders (4.3 million bytes) of space with a
resulting capacity of about 150 jobs. Assuming that the
accesses to the job queue account for between 50 and 67 percent
of the disk seek activity on MVTLNX, and that the mean
MVTLNX seek is 8.52 percent, then the job queue seek is from
0.50 * 8.52 = 4.3 to 0.67 * 8.52 = 5.7 percent. Considering
that 20 hours per day the system is run with no time-sharing
(i.e., with the 2301 drum available), then from 52 minutes
to 69 minutes (20 hours * 0.057 = 1.14 hours = 69 minutes,
20 hours * 0.043 = 0.86 hours = 51.6 minutes) per day is spent
waiting for access to the job queue.
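
The job queue estimate can be reproduced with a few lines of Python. This is a sketch under the assumptions stated in the text (50 to 67 percent of MVTLNX seek activity, 8.52 percent mean seek, 20 hours per day); the function and constant names are hypothetical. Exact arithmetic gives about 68.4 minutes for the upper bound, which the text states as 69 minutes.

```python
# Illustrative sketch; names are hypothetical, figures are those quoted in the text.
MEAN_MVTLNX_SEEK_PCT = 8.52      # mean MVTLNX seek, percent of elapsed time
HOURS_WITHOUT_TIMESHARING = 20   # hours per day with the 2301 drum available


def job_queue_seek_pct(share_of_seek_activity):
    """Percent of elapsed time attributed to job queue seeks."""
    return share_of_seek_activity * MEAN_MVTLNX_SEEK_PCT


def minutes_per_day(seek_pct, hours=HOURS_WITHOUT_TIMESHARING):
    """Convert a percentage of elapsed time into minutes per day."""
    return hours * 60 * seek_pct / 100


for share in (0.50, 0.67):
    pct = round(job_queue_seek_pct(share), 1)  # rounded to one decimal as in the text
    print(f"share {share:.2f}: {pct} percent, {minutes_per_day(pct):.1f} minutes per day")
```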

Now suppose the job queue were placed on the 2301 drum.
Arm seek delay would be nonexistent and although there would
be some delay in the form of rotational delay, the delay would
be less than the rotational delay of the 2314 disk. The
improvement gained would be at most 0.86 hours/20 hours = 4.3
percent to 1.14 hours/20 hours = 5.7 percent. The 2301 drum
is used for four hours per day in support of the CP/CMS
time-sharing system and therefore a utility program would be
needed to transfer the job queue from the 2314 disk to the
2301 drum and back again at the conclusion of the time-sharing
period. This transfer of the job queue would also require
a reformatting of the job queue to coincide with the recording
techniques used on the 2301 drum. The required utility program
does not exist and one NPS system programmer suggested
that it would be very difficult to write. Disadvantages
in moving the job queue from the disk to drum and back 
include operator inconvenience, time required for transfer, 
and possible error and subsequent loss of the job queue. 

Another alternative would be to move the job queue to
another disk module. Currently there would have to be an
examination of the other disk modules to determine which
data sets should be moved to make room for the job queue,
as there is not sufficient empty space on the other modules
to relocate the job queue. The effect on performance would
be difficult to estimate; however, this task would have the
significant value of serving as a basis for comparison with
a repeated conduct of the same experiments after the job
queue had been relocated. (This possibility is discussed later.)

What other data set might be moved? The link library 
is currently allocated 50 cylinders (7.2 million bytes) of 
space which is about twice the capacity of the 2301 drum.
Similar comments to those about moving the job queue to 
another disk module apply to moving the link library to 
another disk module. 

This leads one to consider the System Management Facilities
(SMF) data sets. Two data sets, SYS1.MANX and SYS1.MANY, are
utilized for recording SMF data. Two data sets are used so
that when one data set is full the recording is switched to
the other data set. Another data set on the disk module
MVTLNX is named SYS1.SMFTUB. The data from SYS1.MANX or
SYS1.MANY is transferred to SYS1.SMFTUB as each is filled.
Later SYS1.SMFTUB is transferred to magnetic tape. When the
transfers from SYS1.MANX or SYS1.MANY to SYS1.SMFTUB take
place, the disk arm must move back and forth on the same
disk module, the same disk module which already is the most
active. This occurs about once per day and the transfer is
usually done on the 0000-0800 shift to minimize the effect
of disk arm interference on system performance.

Assuming that the SMF recording is 12.5 to 25 percent of
the activity on the disk module MVTLNX and using the mean of
8.52 percent MVTLNX seek averaged over the seven experiments,
SMF recording would account for 0.25 * 8.52 = 2.13 percent of
MVTLNX seek. This amounts to 2.15 percent * 20 hours = 0.430 hours
= 25.8 minutes per day spent waiting for the disk arm to move
to another track in order to record SMF data. Considering
that this time also causes contention with the job queue and
link library activity, it would be advantageous to record
SMF data on one of the more lightly used disk modules.
Elimination of the 25.8 minutes SMF time would represent
at most 0.43 hours/20 hours = 2.15 percent improvement in
system performance, or 1.07 percent if SMF recording
is 12.5 percent.

If the two suggested changes in the contents of the
disk module MVTLNX were made, moving the job queue to the
2301 drum and moving the SMF data sets to a less active disk
module, the total improvement would be at best 5.7 percent +
2.15 percent = 7.85 percent. Using an average
of 44 job steps executed per hour this 7.85 percent improvement
would represent 3.45 additional job steps per hour
throughput or 69 additional job steps per 20 hour day.
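
The combined estimate can be traced with the short Python sketch below. It simply repeats the arithmetic of the text; the function name and argument names are hypothetical, and the inputs (5.7 percent, 2.15 percent, 44 job steps per hour, 20 hours) are the figures quoted above.

```python
# Illustrative sketch; names are hypothetical, figures are those quoted in the text.
def improvement_summary(job_queue_pct, smf_pct, steps_per_hour, hours_per_day):
    """Return (total percent gain, extra job steps per hour, extra per day)."""
    total_pct = job_queue_pct + smf_pct
    extra_per_hour = steps_per_hour * total_pct / 100
    extra_per_day = extra_per_hour * hours_per_day
    return total_pct, extra_per_hour, extra_per_day


total, per_hour, per_day = improvement_summary(5.7, 2.15, 44, 20)
print(f"{total:.2f} percent, {per_hour:.2f} steps per hour, {per_day:.0f} steps per day")
```

These are upper bounds in the same sense as in the text, since they assume that all of the measured seek wait attributed to the two data sets is recovered.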

Two implied assumptions affecting disk seek time that 
should be explained here are the order of requests to the 
disk and the distance between active data sets. Since the 
exact order of requests is unknown and the requests do not 
follow any fixed pattern in a multiprogramming environment, 
the assumption of random ordering seems reasonable. The
three critical data sets - SYS1.JOBQUE, SYS1.LINKLIB, and
SYS1.MANX(Y) - are located contiguously so as to minimize
the arm seeking delay and thus neglecting the actual distance
moved and averaging the arm seek times seems reasonable.


B. SYSTEM THROUGHPUT 

Discussion of disk performance in particular and computer 
system performance in general must be considered in the 
context of system throughput. During the month of March 1972 
when these experiments were performed, the computer center 
processed 24,497 jobs. Of this total number of jobs, 11,681 
were run under the QUICKRUN job management system. Figure 9 
shows the system throughput in terms of jobs completed during 
each hour of the day. Considering the period of time from
1200-1600 the average number of jobs completed per hour is
1933 whereas for the next busiest four hours (1000-1200 and
1600-1800) the average number of jobs completed is 1699.
If these averages are normalized to reflect the different
quantities of problem program core available (222K bytes
from 1200-1600 and 478K bytes from 1000-1200 and 1600-1800),
then the throughput per unit core is even greater during the
1200-1600 time period while 256K bytes core storage are lost
to the time-sharing system. Lest one conclude that a reduced
amount of core storage improves system throughput one must
consider the different operating policies in effect during
these two different time periods.
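
The normalization described above can be sketched as follows. The text does not state the normalized values, so this is only one plausible reading: jobs completed per hour divided by the K bytes of problem program core available. The function name is hypothetical.

```python
# Illustrative sketch; the normalization (jobs per K byte of problem program
# core) is an assumed reading of the text, and the names are hypothetical.
def jobs_per_kbyte(jobs_per_hour, core_kbytes):
    """Jobs completed per hour per K byte of problem program core."""
    return jobs_per_hour / core_kbytes


afternoon = jobs_per_kbyte(1933, 222)  # 1200-1600, time-sharing active
adjacent = jobs_per_kbyte(1699, 478)   # 1000-1200 and 1600-1800

print(f"1200-1600:    {afternoon:.2f} jobs per K byte")
print(f"other hours:  {adjacent:.2f} jobs per K byte")
print(f"ratio:        {afternoon / adjacent:.2f}")
```

On this reading the 1200-1600 period completes roughly two and a half times as many jobs per K byte, which is why the difference in operating policy has to be taken into account before drawing conclusions.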

Hour Ending                       Hour Ending
at Time    Jobs         % Total   at Time    Jobs    % Total

0100       434          1.8       1300       1835    7.5
0200       277          1.1       1400       1782    7.3
0300       173          0.7       1500       2051    8.4
0400       150          0.6       1600       2065    8.4
0500       117          0.5       1700       1824    7.4
0600       110          0.4       1800       1777    7.3
0700       104          0.4       1900       1139    4.6
0800       22           0.3       2000       1022    4.2
0900       559          2.3       2100       1139    4.6
1000       [illegible]  5.4       2200       1261    5.1
1100       1749         7.1       2300       1153    4.7
1200       1495         6.1       2400       946     3.9

TOTAL      24,497       100%
QUICKRUN   11,681       47.7%

Figure 9. March 1972 System Throughput

During the 1200-1600 time period when only 222K bytes
of core storage are available, only small, short jobs, 100K
bytes or less, 20 seconds CPU time or less, are allowed to be
run. This control is obtained by a combination of two
factors. First, job classes are defined to segregate these 
jobs into one class and secondly, the operator controls the 
starting of initiator programs to run only this one class of 
jobs. Thus, the operating policy favors the predominant
job type, giving fast turnaround to these jobs and operating
within the core storage limitations imposed by the loss of
256K bytes of core storage for use by the time-sharing
system. This operating policy discriminates against larger,
longer jobs and also has an effect on system utilization.
There is not a mix of I/O bound jobs and compute bound jobs
during this time period so that CPU utilization decreases
while I/O activity increases (Figure 10).


C. PLOTTING AND STATISTICAL ANALYSIS PROGRAMS 

Some improvements were recommended in the data reduction 
and analysis programs written by Hanke (Reference 4). A 
program, Hardware Graph, processes data from the hardware 
monitor by reading keypunched data cards and producing bar 
graphs for each event monitored. It was desired to plot 
multiple events on one graph so that the analyst might be 
able to determine trends or possible interaction between 
various events. The plotting program listed in Appendix A 
is adapted from the locally obtained program STPLOT. By 
changing a FORTRAN READ statement and corresponding FORMAT 
statement, the user may plot various combinations of events, 


up to a maximum of ten. The plot is output on the line
printer and the user must draw lines to connect the points
corresponding to the events plotted. The rapid turnaround
for this program makes it very useful for quick visual
analysis of experimental results.

                                  512K bytes    768K bytes
EVENT                             no drum       and drum      Ratios

CPU wait                          65.?8         47.13         1.39
CPU wait and channel
  not busy                        29.79         26.07         1.14
CPU wait and channel busy         34.61         23.06         1.26
CPU wait and channel not
  busy and MVTREX seek            [illegible]   [illegible]   [illegible]
CPU wait and channel not
  busy and MVTLNX seek            12.34         7.62          1.62
CPU wait and channel not
  busy and LINDA seek             2.87          0.12          23.9
CPU wait and channel not
  busy and SPOOL 1 seek           2.26          1.14          1.98
CPU wait and channel not
  busy and SPOOL 2 seek           6.03          2.15          2.80
CPU wait and channel not
  busy and SPOOL 3 seek           2.64          2.52          1.05

Figure 10. Comparison of OS/MVT performance
with 768K bytes vs. 512K bytes of Core

Hanke's program, SMF Graph, reads the System Management 
Facilities data from the SMF data set on the disk module 
MVTLNX and provides a summary and some analysis of this 
job stream data. One input parameter to this program is
the time of day when measurement starts. This is adequate to
locate the desired SMF data if data from only one day is
currently recorded. Sometimes data from more than one day
is in the SMF data set in which case the desired data might
not be obtained using the original version of SMF Graph.
An assembly language subprogram was added to SMF Graph to
require the user to input the desired date as another input
parameter to SMF Graph and to give SMF Graph the capability
to check for that date in the SMF data set.

Statistical analysis of the hardware monitor output was 
performed with the assistance of programs from UCLA's BIMED 
series (Reference 7). These programs provide many standard
statistical measures such as means, variances, and correlations
with a minimum effort on the part of the user. An example
of the results of computation for one experiment is shown
in Appendix C.


D. FIGURE OF MERIT 
During the course of this research, the question arose
as to whether the results obtained were typical of those
which might be obtained by similar experiments on other
computer systems. Also, what experiments in measuring computer
performance are in progress at other university computing
centers? Estrin in Reference 8 states that the results of
experiments should be reproducible in order to be of any
value for subsequent generalization.

For these and other reasons, a survey was designed to 
inquire about the computer performance at other computer 
facilities. Shown in Appendix D, this survey will be sent 
to many installations which use an IBM 360/67 and to many 
other universities. The results will be compiled and made 
available to contributors in an effort toward further 
understanding of computer performance measurement and 
computer system performance optimization. 

One key question in the survey asks, "Is there any one
overall figure of merit or performance index computed by
combination of several performance parameters? (Please give
formula)". The possibility of obtaining a concise answer
to this question seems remote since very little
research has been done on this problem, although this
question is currently under study at this institution. If
there is a valid figure of merit for a computer installation,
or a computer operating environment, it would certainly be
of interest and of value to other computer center staffs.

V. CONCLUSIONS AND RECOMMENDATIONS


There are three positive results derived from the conduct
of this research. First, the actual performance of the
computer system during a stipulated time period can be stated
as fact rather than conjecture; this can be used as a
basis for future performance comparisons. Secondly, a
positive recommendation for improvement can be made and
thirdly, the author is now prepared to conduct further performance
evaluation analyses of computer systems.

The ability to state the performance of a computer system
as a fact is valuable to the manager of a computer system.
Plans and decisions can be based on this factual performance
data with some level of confidence, which is certainly greater
than the confidence based on unproven conjectures. In
addition, future performance measurements can use the results
reported here as a basis for comparison. Any comparison,
however, would have to carefully reconsider the measurement
environment.

The ability to make a positive recommendation is 
particularly significant. It may be very interesting to
measure performance of various components of a computer
system; however, if no positive recommendation for improvement
can be made, the effort expended in measurement is wasted.
The recommendation from this research is to move the
operating system job queue data set and the System Management
Facility (SMF) recording data set to a more lightly used
disk module on the 2314 Disk Facility. Using three disk
modules for storage of operating system data sets would
balance the demand on the individual disk modules. Moving
the job queue to a third system disk module could lower the
mean seek wait on MVTLNX by 2.82 percent (8.52 - 5.7) (Section
IV, A). This would result in a 2.9 percent increase in
system throughput (2.82/(100 - 2.82)) plus some additional increase
due to the elimination of arm seek contention on the disk
module MVTLNX. Balancing the demand on the individual disk
modules is therefore estimated to represent a 3 to 5 percent
improvement in system throughput.
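
The arithmetic behind the 2.9 percent figure can be checked with a short
sketch. It assumes that mean seek wait is expressed as a percentage of
elapsed time and that all recovered wait time is converted into useful
work; the function name is illustrative and not part of the thesis.

```python
def throughput_gain_percent(current_wait_pct, target_wait_pct):
    """Estimate the percent increase in throughput when seek wait
    (as a percent of elapsed time) falls from current to target.

    Assumption: the same work completes in (100 - recovered) percent
    of the original time, so throughput rises by
    recovered / (100 - recovered).
    """
    recovered = current_wait_pct - target_wait_pct
    return 100.0 * recovered / (100.0 - recovered)


# Figures quoted in the text: 8.52 percent seek wait lowered to 5.7 percent.
gain = throughput_gain_percent(8.52, 5.7)
print(f"Estimated throughput gain: {gain:.1f} percent")
```

The result covers only the recovered seek wait; the further gain from
removing arm seek contention is not modeled here.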

The computer system at the Naval Postgraduate School is
owned by the U.S. Navy. Using the replacement cost of
$4.8 million and an estimated 60 months (5 years) of system
life, a monthly cost of $80,000 ($4.8 million/60 months)
may be assumed. A 3 to 5 percent improvement thus represents
a potential savings of $2400 to $4000 per month. Savings at
this rate would pay for the hardware monitor used
for the performance measurements in less than 4 to 8 months'
time. Thus, this one experiment in performance improvement,
by paying for the hardware monitor, provides the potential
for future performance measurement efforts at essentially
no cost.
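
The cost reasoning above can be sketched as follows. The replacement cost
and system life are the figures given in the text; the function names are
illustrative, and the monitor price is left as a parameter because the
text does not state it.

```python
REPLACEMENT_COST = 4_800_000   # dollars, from the text
SYSTEM_LIFE_MONTHS = 60        # 5 years, from the text


def monthly_cost(replacement_cost=REPLACEMENT_COST,
                 life_months=SYSTEM_LIFE_MONTHS):
    """Straight-line monthly cost of the system over its assumed life."""
    return replacement_cost / life_months


def monthly_savings(improvement_pct):
    """Dollar value per month of a given percent throughput improvement."""
    return monthly_cost() * improvement_pct / 100.0


def payback_months(monitor_cost, improvement_pct):
    """Months of savings needed to cover a monitor of the given cost.

    monitor_cost is a hypothetical input; the thesis text gives only
    the resulting payback range, not the purchase price.
    """
    return monitor_cost / monthly_savings(improvement_pct)


print(f"Monthly cost: ${monthly_cost():,.0f}")
for pct in (3.0, 5.0):
    print(f"{pct:.0f} percent improvement: ${monthly_savings(pct):,.0f} per month")
```

Calling payback_months with the actual purchase price of the monitor would
reproduce the payback period quoted in the text.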

Concurrent with the reporting of this research, a later
version of the operating system known as Release 20 of
OS/MVT is being implemented at this computer center. A
decision has been made to eliminate the use of the disk
module named LINDA as a user disk module and to use LINDA
as a third operating system disk module. Thus, the results
of performance measurement are providing an input to the
decision making process for configuration changes. It is
recommended that measurements be taken to verify
the suggested improvement in system performance and to
determine if the new version of the operating system has
created any previously unknown problems.

The preparation and education of the author to conduct 
future performance evaluation analyses of computer systems 
is a result of this research effort which may be of real 
benefit to the Navy. The number of trained analysts in 
computer system performance evaluation is small in contrast 
to a growing need. It does not appear that main frame 
manufacturers are going to expend great effort to assist 
clients in performance optimization through performance 
measurement, as this would probably reduce sales of additional
equipment. The users therefore will have to train their 
own performance analysts or resort to outside consultants 
in order to use performance measurement to optimize system 
resource utilization. 

Further performance measurement of the computer system 
at this installation would be useful. Questions requiring 
further research include: 


1. What part of the CPU wait only time is spent
waiting for an operator's console response?

2. Does the operator's console activity vary
widely from shift to shift?

3. How has the addition of the IBM 2321 Data
Cell affected system performance?

4. What is the effect of having non-resident
Supervisor Calls (SVC's) when only 512K
bytes of core storage is available?

5. What other parts of the operating system
could be made non-resident?


In addition to performing specific performance measurement
experiments, it is recommended that this computer installation
establish a procedure for periodic system profile measurements.
Monthly accounting data is currently recorded and presented
to the analyst in a very usable form. The same amount of
effort should be expended to provide monthly hardware performance
profile information to accompany the accounting data.

This thesis describes the steps taken to improve the
performance of a computer system. Further improvements in
performance may be available for the cost of performing
further analysis. Since each one and a half percent improvement
amounts to a $1200 per month increase in computing power
for the rest of the system life, these improvements should
be actively pursued.

APPENDIX A
PLOTTING PROGRAM

APPENDIX B
PLOTTING OUTPUT

APPENDIX C
CORRELATION MATRICES

APPENDIX D
FIGURE OF MERIT SURVEY 


NAVAL POSTGRADUATE SCHOOL 
Monterey, California 


Dear Sirs: 


As part of a continuing study of computer performance
measurement, a survey is being undertaken by the Naval
Postgraduate School "Computer Science Group". This survey
will seek to collect information about performance measurement
at other computer installations. We are most interested
in how performance is measured (hardware monitors, software
monitors, accounting data), what parameters are measured
(CPU utilization, I/O overlap, core utilization), what
typical or realistic values these parameters take for
particular job streams, and very importantly, what use is
made of these results.


Your cooperation is requested in completing the enclosed
form as completely and accurately as possible. The results
of all returned surveys will be compiled and distributed
to all contributors.


Sincerely, 


Ge Hi SYNS 
Assistant Professor 

Installation Name   Point of Contact

(1) Main Frame Designation/Model   (2) Own/Lease/Rent

(3) Disk Units (model)   (4) Number Tape Drives Number hours of operation/day

(7) Core Storage   (8) Amount   (9) Bulk (slow) Core Stg.   (10) Amount

(11) Drum   (12) Capacity   (13) Terminals   (14) Number   (15) Oper. (time share) Systems

(16) Printers   (17) Card Reader/Punch   (18) Other Input/Output Devices


19. Size of user community? (Students, faculty, staff)

20. Job stream
    a. Jobs/month:
    b. Job size distribution (core used):
    c. Job time distribution (CPU time used):

21. Turnaround time
    a. Average per job:
    b. Distribution:

22. What type of performance measurements are implemented?
    a. Hardware monitor?
       1) Model:
       2) Own/Lease:
       3) Configuration:
          No. probes:
          No. accumulators (counters):
          Recording media (mag tape, paper tape):
    b. Software monitor?
       1) Name:
       2) Own/Lease:
       3) Capabilities:
    c. Accounting routines
       1) Acquired from manufacturer/locally developed


23. Are the outputs of any measurement tools used as inputs
    to any type of configuration simulations? If so, which ones?

24. Just what parameters are measured in detail? Please give
    yes or no and recent mean values or ranges if possible.
    a. CPU utilization
    b. Channel utilization
    c. Channel utilization while CPU wait
    d. Device utilization
       transfer
       seek
       queue length
    e. Length of job queue
       maximum
       average
    f. Core segment utilization
    g. Overhead time (%)
    h. System Data Sets
       Transfer
       Seek
       Queue length
    i. Supervisor Calls
       Active
       Loading
       Inactive
    Job Stream Data
    j. Job arrival distribution
    k. Distribution of jobs by language
    l. Distribution of jobs by core size request
    m. Distribution of jobs by time requested
    n. Distribution of turnaround time by job time
    o. Distribution of turnaround time by job size (core)
    p. Amount of I/O per job
    q. "Cost" per job (or charge schedule)

25. Is a full time time-sharing system supported?
    What system?

26. If only a part time time-sharing is supported, during
    what hours of the day is it available?

27. Is a remote job entry capability supported?

28. Can the user monitor the queue status to determine
    where his job is located?

29. Are your "customers/users" satisfied with the performance
    of your computer system?

30. How do you know?

31. Are the staff/operators satisfied with the performance
    of computer system?

32. How do you know?

33. Is there any one overall figure of merit or performance
    index computed by combination of several performance
    parameters? (Please give formula.)

34. Which parameters in question 24 do you consider most
    significant as an indication of computer system
    performance?